Bayesian Clustering with Priors on Partitions
نویسنده
چکیده
Traditional clustering algorithms are deterministic in the sense that a given dataset always leads to the same output partition. This paper modifies traditional clustering algorithms whereby data is associated with a probability model, and clustering is carried out on the stochastic model parameters rather than the data. This is done in a principled way using a Bayesian approach which allows the assignment of posterior probabilities to output partitions. In addition, the approach incorporates prior knowledge of the output partitions using Bayesian melding. The methodology is applied to two substantive problems; (1) a question of stylometry involving a simulated dataset and (2) the assessment of potential champions of the 2010 FIFA World Cup.
منابع مشابه
Online Video Segmentation by Bayesian Split-Merge Clustering
We present an online video segmentation algorithm based on a novel nonparametric Bayesian clustering method called Bayesian Split-Merge Clustering (BSMC). BSMC can efficiently cluster dynamically changing data through split and merge processes at each time step, where the decision for splitting and merging is made by approximate posterior distributions over partitions with Dirichlet Process (DP...
متن کاملNonparametric Bayesian Co-clustering Ensembles
A nonparametric Bayesian approach to co-clustering ensembles is presented. Similar to clustering ensembles, coclustering ensembles combine various base co-clustering results to obtain a more robust consensus co-clustering. To avoid pre-specifying the number of co-clusters, we specify independent Dirichlet process priors for the row and column clusters. Thus, the numbers of rowand column-cluster...
متن کاملBayesian Sample size Determination for Longitudinal Studies with Continuous Response using Marginal Models
Introduction Longitudinal study designs are common in a lot of scientific researches, especially in medical, social and economic sciences. The reason is that longitudinal studies allow researchers to measure changes of each individual over time and often have higher statistical power than cross-sectional studies. Choosing an appropriate sample size is a crucial step in a successful study. A st...
متن کاملOn a class of random probability measures with general predictive structure
In this paper we investigate a recently introduced class of nonparametric priors, termed generalized Dirichlet process priors. Such priors induce (exchangeable random) partitions which are characterized by a more elaborate clustering structure than those arising from other widely used priors. A natural area of application of these random probability measures is represented by species sampling p...
متن کاملBayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on (0, 1) interval that covers symm...
متن کامل